In statistics, unit-weighted regression is a simplified and robust version (Wainer & Thissen, 1976) of multiple regression analysis where only the intercept term is estimated. That is, it fits a model:

\hat{y} = \hat{f}(\mathbf{x}) = \hat{b} + \sum_i x_i

where each x_i is a binary variable, perhaps multiplied by an arbitrary weight. Contrast this with the more common multiple regression model, where each predictor has its own estimated coefficient:

\hat{y} = \hat{f}(\mathbf{x}) = \hat{b} + \sum_i \hat{w}_i x_i

In the social sciences, unit-weighted regression is sometimes used for binary classification, i.e. to predict a yes-no answer where \hat{y} < 0 indicates "no" and \hat{y} \ge 0 indicates "yes". It is easier to interpret than multiple linear regression (known as linear discriminant analysis in the classification case).
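The intercept-only model above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes the intercept is chosen by least squares (which reduces to the mean of the outcome minus the predictor sum), and the function names and toy data are invented for the example.

```python
from statistics import mean

def fit_unit_weighted(X, y):
    """Estimate the only free parameter, the intercept b.

    With every predictor weight fixed at one, the least-squares
    intercept is the mean of (y - sum of predictors).
    """
    sums = [sum(row) for row in X]
    return mean(yi - si for yi, si in zip(y, sums))

def predict(X, b):
    """Return b + sum_i x_i for every row of X."""
    return [b + sum(row) for row in X]

def classify(X, b):
    """Scores >= 0 are read as "yes", scores < 0 as "no"."""
    return ["yes" if score >= 0 else "no" for score in predict(X, b)]

# Hypothetical toy data: three binary predictors, outcome coded +1 / -1.
X = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
y = [1, -1, 1, -1]

b = fit_unit_weighted(X, y)
labels = classify(X, b)
```

Only `b` is learned from the data; a standard multiple regression would estimate one additional coefficient for each column of `X`.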

Unit weights

Unit-weighted regression is a method of robust regression that proceeds in three steps. First, predictors for the outcome of interest are selected; ideally, there should be good empirical or theoretical reasons for the selection. Second, the predictors are converted to a standard form. Finally, the predictors are added together, and this sum is called the variate, which is used as the predictor of the outcome.

Burgess method

The Burgess method was first presented by the sociologist Ernest W. Burgess in a 1928 study to determine success or failure of inmates placed on parole. First, he selected 21 variables believed to be associated with parole success. Next, he converted each predictor to the standard form of zero or one (Burgess, 1928). When predictors had two values, the value associated with the target outcome was coded as one. Burgess selected success on parole as the target outcome, so a predictor such as a ''history of theft'' was coded as "yes" = 0 and "no" = 1. These coded values were then added to create a predictor score, so that higher scores predicted a better chance of success. The scores could range from zero (no predictors of success) to 21 (all 21 predictors scored as predicting success).

For predictors with more than two values, the Burgess method selects a cutoff score based on subjective judgment. As an example, a study using the Burgess method (Gottfredson & Snyder, 2005) selected as one predictor the number of complaints for delinquent behavior. With failure on parole as the target outcome, the number of complaints was coded as follows: "zero to two complaints" = 0, and "three or more complaints" = 1 (Gottfredson & Snyder, 2005, p. 18).
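The coding and summing steps of the Burgess method can be sketched as follows. This is an illustration under assumptions: success on parole is the target outcome, and the predictor names, the `employed` item, and the cutoff of two complaints are hypothetical choices for this example rather than Burgess's actual 21 items.

```python
# Each rule returns True when the value is the one associated with the
# target outcome (here: success on parole). Names and cutoffs are
# illustrative assumptions, not the original study's variables.
RULES = {
    "history_of_theft": lambda v: v is False,  # "no" = 1, "yes" = 0
    "prior_complaints": lambda v: v <= 2,      # subjectively chosen cutoff
    "employed": lambda v: v is True,
}

def burgess_score(record, rules=RULES):
    """Code each predictor as 0 or 1 and add the codes together."""
    return sum(1 for name, rule in rules.items() if rule(record[name]))

inmate = {"history_of_theft": False, "prior_complaints": 3, "employed": True}
score = burgess_score(inmate)
```

With k rules the score ranges from 0 to k, in the same way that the original scores ranged from 0 to 21.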

Kerby method

The Kerby method is similar to the Burgess method, but differs in two ways. First, while the Burgess method uses subjective judgment to select a cutoff score for a multi-valued predictor with a binary outcome, the Kerby method uses classification and regression tree (CART) analysis. In this way, the selection of the cutoff score is based not on subjective judgment, but on a statistical criterion, such as the point where the chi-square value is a maximum. The second difference is that while the Burgess method is applied to a binary outcome, the Kerby method can apply to a multi-valued outcome, because CART analysis can identify cutoff scores in such cases, using a criterion such as the point where the t-value is a maximum. Because CART analysis is not only binary, but also recursive, the result can be that a predictor variable will be divided again, yielding two cutoff scores. The standard form for each predictor is that a score of one is added when CART analysis creates a partition.

One study (Kerby, 2003) selected as predictors the five traits of the Big Five personality traits, predicting a multi-valued measure of suicidal ideation. Next, the personality scores were converted into standard form with CART analysis. When the CART analysis yielded one partition, the result was like the Burgess method in that the predictor was coded as either zero or one. But for the measure of neuroticism, the result was two cutoff scores. Because higher neuroticism scores correlated with more suicidal thinking, the two cutoff scores led to the following coding: "low Neuroticism" = 0, "moderate Neuroticism" = 1, "high Neuroticism" = 2 (Kerby, 2003).
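The idea of choosing a cutoff by a statistical criterion can be sketched with a single split that maximizes the chi-square statistic of the resulting 2x2 table. This is a simplified stand-in for CART analysis, not Kerby's procedure: it handles only a binary outcome, makes one split rather than splitting recursively, and uses invented data and function names.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def best_cutoff(values, outcomes):
    """Return (cutoff, chi2) for the split 'value >= cutoff' vs 'value < cutoff'."""
    best = (None, -1.0)
    # Skip the smallest value: that split would leave one side empty.
    for cut in sorted(set(values))[1:]:
        a = sum(1 for v, o in zip(values, outcomes) if v >= cut and o == 1)
        b = sum(1 for v, o in zip(values, outcomes) if v >= cut and o == 0)
        c = sum(1 for v, o in zip(values, outcomes) if v < cut and o == 1)
        d = sum(1 for v, o in zip(values, outcomes) if v < cut and o == 0)
        chi2 = chi_square_2x2(a, b, c, d)
        if chi2 > best[1]:
            best = (cut, chi2)
    return best

# Hypothetical multi-valued predictor and binary outcome.
scores = [0, 1, 1, 2, 3, 3, 4, 5]
outcome = [0, 0, 0, 0, 1, 1, 1, 1]

cutoff, chi2 = best_cutoff(scores, outcome)
coded = [1 if v >= cutoff else 0 for v in scores]
```

Applying the same search again within one of the two partitions would yield a second cutoff, which corresponds to the 0/1/2 coding described for neuroticism.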

''z''-score method

Another method can be applied when the predictors are measured on a continuous scale. In such a case, each predictor can be converted into a standard score, or ''z''-score, so that all the predictors have a mean of zero and a standard deviation of one. With this method of unit-weighted regression, the variate is a sum of the ''z''-scores (e.g., Dawes, 1979; Bobko, Roth, & Buster, 2007).
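A minimal sketch of the ''z''-score variate follows. It assumes the sample standard deviation is used for standardizing and that every predictor is already oriented so that higher values point toward the outcome; the function names and data are illustrative.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize a predictor to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def unit_weighted_variate(columns):
    """Sum the z-scores across predictors, one total per case.

    columns: list of predictor columns, all of the same length.
    """
    standardized = [z_scores(col) for col in columns]
    return [sum(zs) for zs in zip(*standardized)]

# Hypothetical predictors measured on very different scales.
grades = [2.0, 3.0, 4.0]
test_scores = [400.0, 500.0, 600.0]

variate = unit_weighted_variate([grades, test_scores])
```

Standardizing first keeps a predictor with a large raw scale, such as the test scores here, from dominating the sum.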

Literature review

The first empirical study using unit-weighted regression is widely considered to be a 1928 study by sociologist Ernest W. Burgess. He used 21 variables to predict parole success or failure, and the results suggest that unit weights are a useful tool in making decisions about which inmates to parole. Of those inmates with the best scores, 98% did in fact succeed on parole; and of those with the worst scores, only 24% did in fact succeed (Burgess, 1928).

The mathematical issues involved in unit-weighted regression were first discussed in 1938 by Samuel Stanley Wilks, a leading statistician who had a special interest in multivariate analysis. Wilks described how unit weights could be used in practical settings, when data were not available to estimate beta weights. For example, a small college may want to select good students for admission, but the school may have no money to gather data and conduct a standard multiple regression analysis. In this case, the school could use several predictors: high school grades, SAT scores, and teacher ratings. Wilks (1938) showed mathematically why unit weights should work well in practice. Frank Schmidt (1971) conducted a simulation study of unit weights. His results showed that Wilks was indeed correct and that unit weights tend to perform well in simulations of practical studies.

Robyn Dawes (1979) discussed the use of unit weights in applied studies, referring to the robust beauty of unit-weighted models. Jacob Cohen also discussed the value of unit weights and noted their practical utility. Indeed, he wrote, "As a practical matter, most of the time, we are better off using unit weights" (Cohen, 1990, p. 1306). Dave Kerby (2003) showed that unit weights compare well with standard regression, doing so with a cross-validation study: he derived beta weights in one sample and applied them to a second sample. The outcome of interest was suicidal thinking, and the predictor variables were broad personality traits. In the cross-validation sample, the correlation between personality and suicidal thinking was slightly stronger with unit-weighted regression (''r'' = .48) than with standard multiple regression (''r'' = .47). Gottfredson and Snyder (2005) compared the Burgess method of unit-weighted regression to other methods, with a construction sample of N = 1,924 and a cross-validation sample of N = 7,552. Using the Pearson point-biserial correlation, the effect size in the cross-validation sample for the unit-weights model was ''r'' = .392, which was somewhat larger than for logistic regression (''r'' = .368) and predictive attribute analysis (''r'' = .387), and less than multiple regression only in the third decimal place (''r'' = .397). In a review of the literature on unit weights, Bobko, Roth, and Buster (2007) noted that "unit weights and regression weights perform similarly in terms of the magnitude of cross-validated multiple correlation, and empirical studies have confirmed this result across several decades" (p. 693). Andreas Graefe applied an equal weighting approach to nine established multiple regression models for forecasting U.S. presidential elections. Across the ten elections from 1976 to 2012, equally weighted predictors reduced the forecast error of the original regression models on average by four percent. An equal-weights model that includes all variables provided calibrated forecasts that reduced the error of the most accurate regression model by 29 percent.

Example

An example may clarify how unit weights can be useful in practice. Brenna Bry and colleagues (1982) addressed the question of what causes drug use in adolescents. Previous research had made use of multiple regression; with this method, it is natural to look for the best predictor, the one with the highest beta weight. Bry and colleagues noted that one previous study had found that early use of alcohol was the best predictor. Another study had found that alienation from parents was the best predictor. Still another study had found that low grades in school were the best predictor. The failure to replicate was clearly a problem, one that could be caused by bouncing betas.

Bry and colleagues suggested a different approach: instead of looking for the best predictor, they looked at the number of predictors. In other words, they gave a unit weight to each predictor. Their study had six predictors: 1) low grades in school, 2) lack of affiliation with religion, 3) early age of alcohol use, 4) psychological distress, 5) low self-esteem, and 6) alienation from parents. To convert the predictors to standard form, each risk factor was scored as absent (scored as zero) or present (scored as one). For example, the coding for low grades in school was as follows: "C or higher" = 0, "D or F" = 1. The results showed that the number of risk factors was a good predictor of drug use: adolescents with more risk factors were more likely to use drugs.

The model used by Bry and colleagues was that drug users do not differ in any special way from non-drug users. Rather, they differ in the number of problems they must face. "The number of factors an individual must cope with is more important than exactly what those factors are" (p. 277). Given this model, unit-weighted regression is an appropriate method of analysis.

Beta weights

In standard multiple regression, each predictor is multiplied by a number that is called the ''beta weight'', ''regression weight'' or ''weighted regression coefficient'' (denoted β_W or BW). The prediction is obtained by adding these products along with a constant. When the weights are chosen to give the best prediction by some criterion, the model is referred to as a proper linear model. Therefore, multiple regression is a proper linear model. By contrast, unit-weighted regression is called an improper linear model.

Model specification

Standard multiple regression hinges on the assumption that all relevant predictors of the outcome are included in the regression model. This assumption is called model specification. A model is said to be specified when all relevant predictors are included in the model, and all irrelevant predictors are excluded from the model. In practical settings, it is rare for a study to be able to determine all relevant predictors a priori. In this case, models are not specified and the estimates for the beta weights suffer from omitted variable bias. That is, the beta weights may change from one sample to the next, a situation sometimes called the problem of the bouncing betas. It is this problem with bouncing betas that makes unit-weighted regression a useful method.
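The instability of beta weights from one sample to the next can be illustrated with a small simulation. This sketch does not model omitted variables; it assumes small samples and two strongly correlated predictors, which is one setting in which estimated weights vary noticeably. The data-generating process, sample size, and function names are assumptions made for the illustration.

```python
import random

def ols_two_predictors(x1, x2, y):
    """Least-squares slopes for y ~ b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def draw_sample(rng, n=20):
    """Small sample in which both predictors truly have equal weight."""
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [a + rng.gauss(0, 0.3) for a in x1]  # strongly correlated with x1
    y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    return x1, x2, y

rng = random.Random(0)
estimates = [ols_two_predictors(*draw_sample(rng)) for _ in range(5)]
for b1, b2 in estimates:
    print(f"b1 = {b1:+.2f}, b2 = {b2:+.2f}")
```

Across the five samples the estimated pairs differ even though the generating weights are identical, whereas a unit-weighted variate would use the same weights in every sample by construction.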

References

*Bobko, P., Roth, P. L., & Buster, M. A. (2007). "The usefulness of unit weights in creating composite scores: A literature review, application to content validity, and meta-analysis". ''Organizational Research Methods'', volume 10, pages 689–709.
*Burgess, E. W. (1928). "Factors determining success or failure on parole". In A. A. Bruce (Ed.), ''The Workings of the Indeterminate Sentence Law and Parole in Illinois'' (pp. 205–249). Springfield, Illinois: Illinois State Parole Board.
*Cohen, Jacob. (1990). "Things I have learned (so far)". ''American Psychologist'', volume 45, pages 1304–1312.
*Dawes, Robyn M. (1979). "The robust beauty of improper linear models in decision making". ''American Psychologist'', volume 34, pages 571–582.
*Gottfredson, D. M., & Snyder, H. N. (July 2005). ''The mathematics of risk classification: Changing data into valid instruments for juvenile courts''. Pittsburgh, Penn.: National Center for Juvenile Justice. NCJ 209158.
*Kerby, Dave S. (2003). "CART analysis with unit-weighted regression to predict suicidal ideation from Big Five traits". ''Personality and Individual Differences'', volume 35, pages 249–261.
*Schmidt, Frank L. (1971). "The relative efficiency of regression and simple unit predictor weights in applied differential psychology". ''Educational and Psychological Measurement'', volume 31, pages 699–714.
*Wainer, H., & Thissen, D. (1976). "Three steps toward robust regression". ''Psychometrika'', volume 41(1), pages 9–34.

External links

*Chris Stucchio blog: Why a pro/con list is 75% as good as your fancy machine learning algorithm